Data Structures & Algorithms Essentials

Internalize how data organization + algorithmic paradigms shape performance. Focus on mental models, complexity trade‑offs, and implementation patterns rather than rote memorization.

🧠 Big-O
🧵 Recursion
🗂️ Selection
🔍 Searching
🪜 Traversal
⚡ Optimization

1. Core Complexity Concepts

Asymptotic reasoning is a comparative tool, not a stopwatch.

Asymptotic Notation

  • O(f(n)): upper bound (growth at most f)
  • Ω(f(n)): lower bound (growth at least f)
  • Θ(f(n)): tight bound (both O and Ω)
  • Little-o: strictly smaller growth (f = o(g))
  • Amortized: average cost per operation over a sequence (see the sketch after this list)
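
A concrete instance of amortized cost, as a minimal sketch (class and method names are illustrative): a dynamic array that doubles its capacity pays an occasional O(n) copy, but n pushes cost O(n) total, so each push is O(1) amortized.

class DynArray {
  private buf: number[] = new Array(1);
  private len = 0;

  push(x: number): void {
    if (this.len === this.buf.length) {
      // Rare O(n) copy; doubling means it happens only after len cheap pushes.
      const bigger: number[] = new Array(this.buf.length * 2);
      for (let i = 0; i < this.len; i++) bigger[i] = this.buf[i];
      this.buf = bigger;
    }
    this.buf[this.len++] = x; // the common O(1) case
  }
}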

Growth Ladder (Typical)

  • O(1) → O(log n) → O(n) → O(n log n) → O(n²) → O(2^n) → O(n!)
  • Logarithms often appear with divide & conquer or tree height
  • n log n arises from log n levels of balanced splitting, each doing O(n) combine work

Space vs Time Trade-off

  • Caching / memoization: trade memory for fewer recomputations
  • Precomputation (prefix sums, sparse tables) speeds queries (see the sketch after this list)
  • Compression / streaming reduces memory at CPU cost
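
A minimal sketch of the precomputation idea (function names are illustrative): prefix sums spend O(n) once so every later range-sum query answers in O(1).

// p[i] holds the sum of a[0..i-1]; built once in O(n).
function buildPrefix(a: number[]): number[] {
  const p = new Array(a.length + 1).fill(0);
  for (let i = 0; i < a.length; i++) p[i + 1] = p[i] + a[i];
  return p;
}

// Sum of a[l..r] inclusive, answered in O(1) per query.
function rangeSum(p: number[], l: number, r: number): number {
  return p[r + 1] - p[l];
}

For example, buildPrefix([2, 4, 1]) yields [0, 2, 6, 7], and rangeSum(p, 1, 2) = 7 - 2 = 5.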

Data Structure Selection Heuristics

Need | Best Candidates | Avoid When | Notes
Fast membership test | Hash Set / Bloom Filter | Adversarial keys (hash attacks) | Bloom = probabilistic (false positives)
Sorted iteration | Balanced BST / Heap (partial) | Random-only access | Heaps only guarantee the top element
Min/Max streaming | Heap / Monotonic Queue | Need arbitrary removal | Monotonic queue for sliding windows
FIFO multi-producer | Queue / Ring Buffer | Random deletions required | Ring buffer for fixed capacity
LRU cache | Hash Map + Doubly Linked List | Massive key space w/ low reuse | O(1) get + put with eviction (sketch below)
Optimization mindset: measure → hypothesize → change one variable → re‑measure. Choose structure by workload, not preference.
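
A minimal LRU sketch (names illustrative) leaning on one assumption worth flagging: a JavaScript/TypeScript Map iterates keys in insertion order, so re-inserting a key moves it to the "most recent" end; this stands in for the classic hash map + doubly linked list pairing from the table.

class LRUCache<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key)!;
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }

  put(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    else if (this.map.size === this.capacity) {
      // Evict the least recently used entry: the first key in iteration order.
      this.map.delete(this.map.keys().next().value!);
    }
    this.map.set(key, value);
  }
}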

2. Core Data Structures

Understand shape, operations, and cost model.

Array

Contiguous memory, O(1) index, expensive middle inserts (O(n)). Great for iteration & cache locality.

O(1) access · resize cost

Linked List

Node chain w/ pointers. O(1) head insert; O(n) index; pointer overhead, poor cache locality.

O(1) head · fragmented

Stack

LIFO: push/pop O(1). Used for backtracking, parsing, and emulating DFS recursion. Backed by an array or linked list.

call frames

Queue

FIFO: enqueue/dequeue O(1). Scheduling, BFS. Use a circular buffer to avoid O(n) shifts.

breadth

Hash Table

Average O(1) put/get; degrades to O(n) worst case if collisions go unmitigated. Good hashing & resizing are crucial.

amortized

Binary Search Tree

Ordered structure. Average O(log n) ops; unbalanced degenerates to O(n). Self-balancing variants (AVL, Red-Black) maintain logarithmic height.

ordered

Heap (Binary)

Partial order: root holds the min/max. Insert & extract O(log n). Useful for priority scheduling & streaming top-k queries (see the sketch below).

priority
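
A minimal array-backed binary min-heap sketch (illustrative, not production-grade): the parent of index i sits at (i - 1) >> 1, and both operations walk a single root-to-leaf path, hence O(log n).

class MinHeap {
  private a: number[] = [];

  insert(x: number): void {
    this.a.push(x);
    let i = this.a.length - 1;
    while (i > 0) { // sift up until the parent is no larger
      const p = (i - 1) >> 1;
      if (this.a[p] <= this.a[i]) break;
      [this.a[p], this.a[i]] = [this.a[i], this.a[p]];
      i = p;
    }
  }

  extractMin(): number | undefined {
    if (this.a.length === 0) return undefined;
    const min = this.a[0];
    const last = this.a.pop()!;
    if (this.a.length > 0) {
      this.a[0] = last;
      let i = 0;
      for (;;) { // sift down toward the smaller child
        const l = 2 * i + 1, r = 2 * i + 2;
        let s = i;
        if (l < this.a.length && this.a[l] < this.a[s]) s = l;
        if (r < this.a.length && this.a[r] < this.a[s]) s = r;
        if (s === i) break;
        [this.a[s], this.a[i]] = [this.a[i], this.a[s]];
        i = s;
      }
    }
    return min;
  }
}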

Graph

Entities (nodes) + relations (edges). Choose adjacency list vs matrix based on density.

networks

Trie (Prefix Tree)

String-keyed; O(L) lookup, where L is the key length. Memory heavy; compress with a radix tree / DAWG. (See the sketch below.)

strings
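
A minimal trie sketch (names illustrative): both operations touch one node per character, so cost is O(L) in the key length regardless of how many keys are stored.

class TrieNode {
  children = new Map<string, TrieNode>();
  isWord = false;
}

class Trie {
  private root = new TrieNode();

  insert(word: string): void {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
      node = node.children.get(ch)!;
    }
    node.isWord = true; // mark the end of a complete key
  }

  contains(word: string): boolean {
    let node = this.root;
    for (const ch of word) {
      const next = node.children.get(ch);
      if (!next) return false;
      node = next;
    }
    return node.isWord;
  }
}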

3. Algorithmic Paradigms

Patterns > isolated tricks. Generalize transformations.

Sorting Spectrum

  • Comparison Lower Bound: Ω(n log n)
  • Bubble / Selection / Insertion: O(n²); fine for tiny or nearly-sorted arrays
  • Merge: stable, O(n log n), O(n) extra space (see the sketch after this list)
  • Quick: average O(n log n), worst O(n²) with poor pivots
  • Heap: O(n log n), in-place, not stable
  • Counting/Radix: linear time when digits/keys come from a bounded domain
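
A merge sort sketch (function name illustrative): two half-size recursions plus a linear merge give T(n) = 2T(n/2) + O(n) = O(n log n); taking from the left half on ties keeps it stable.

function mergeSort(a: number[]): number[] {
  if (a.length <= 1) return a;
  const mid = a.length >> 1;
  const left = mergeSort(a.slice(0, mid));   // solve left half
  const right = mergeSort(a.slice(mid));     // solve right half
  const out: number[] = [];
  let i = 0, j = 0;
  while (i < left.length && j < right.length) {
    out.push(left[i] <= right[j] ? left[i++] : right[j++]); // <= preserves stability
  }
  while (i < left.length) out.push(left[i++]);
  while (j < right.length) out.push(right[j++]);
  return out;
}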

Searching Patterns

  • Linear scan: unsorted data, O(n)
  • Binary search: sorted data, O(log n)
  • Two pointers (sorted arrays or partition problems)
  • Sliding window for subarray/substring constraints
  • Binary search on answer (monotone feasibility function; see the sketch after this list)
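
A sketch of binary search on the answer (feasible is a hypothetical placeholder the caller supplies): when feasible(x) is monotone (all false, then all true), the boundary is found in O(log range) probes.

// Returns the smallest x in [lo, hi] with feasible(x) === true,
// assuming feasible is monotone and at least one feasible x exists.
function minFeasible(lo: number, hi: number, feasible: (x: number) => boolean): number {
  while (lo < hi) {
    const mid = lo + ((hi - lo) >> 1);
    if (feasible(mid)) hi = mid;  // mid works: the answer is mid or smaller
    else lo = mid + 1;            // mid fails: the answer is strictly larger
  }
  return lo;
}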

Recursion & Divide

  • Split → Solve → Combine (merge sort)
  • Tail recursion: the final call can be rewritten as a loop (not every language optimizes this)
  • Backtracking: explore + undo (N-Queens, subsets; see the sketch after this list)
  • Pruning reduces exponential blow-up
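
A minimal backtracking sketch enumerating subsets (names illustrative): each level chooses an element, recurses, then undoes the choice so sibling branches start from a clean state.

function subsets(nums: number[]): number[][] {
  const result: number[][] = [];
  const current: number[] = [];
  function explore(start: number): void {
    result.push([...current]);    // snapshot the choices made so far
    for (let i = start; i < nums.length; i++) {
      current.push(nums[i]);      // choose
      explore(i + 1);             // explore deeper
      current.pop();              // undo (backtrack)
    }
  }
  explore(0);
  return result;
}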

Dynamic Programming

  • Optimal substructure + overlapping subproblems
  • Top-down memo vs bottom-up table
  • State definition clarity > code golf
  • Space optimization (rolling arrays; see the sketch after this list)
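
A sketch contrasting the two DP styles on a toy recurrence (Fibonacci; names illustrative): top-down memoization caches recursive calls, while the bottom-up version rolls the table down to two variables.

// Top-down: cache each subproblem so it is solved only once.
function fibMemo(n: number, memo = new Map<number, number>()): number {
  if (n <= 1) return n;
  if (memo.has(n)) return memo.get(n)!;
  const v = fibMemo(n - 1, memo) + fibMemo(n - 2, memo);
  memo.set(n, v);
  return v;
}

// Bottom-up with a rolling "table": only the last two states are kept.
function fibRolled(n: number): number {
  if (n <= 1) return n;
  let prev = 0, curr = 1;
  for (let i = 2; i <= n; i++) [prev, curr] = [curr, prev + curr];
  return curr;
}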

Greedy Choice

  • A local optimum composes into the global optimum (prove it via exchange argument)
  • Search for a counterexample before committing
  • Activity selection, interval scheduling (see the sketch after this list)
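
A sketch of greedy activity selection (names illustrative): sort by finish time and take every interval compatible with the last choice; the exchange argument shows no schedule can do better.

interface Interval { start: number; end: number; }

function maxActivities(intervals: Interval[]): Interval[] {
  const sorted = [...intervals].sort((a, b) => a.end - b.end); // earliest finish first
  const chosen: Interval[] = [];
  let lastEnd = -Infinity;
  for (const iv of sorted) {
    if (iv.start >= lastEnd) { // greedy choice: compatible with all picks so far
      chosen.push(iv);
      lastEnd = iv.end;
    }
  }
  return chosen;
}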

Graph Traversal

  • BFS: shortest path in unweighted graphs (see the sketch after this list)
  • DFS: cycle detection, topological order (post-order)
  • Dijkstra: shortest paths with non-negative edge weights
  • Union-Find: connectivity queries
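
A minimal BFS sketch on an adjacency list (names illustrative): processing vertices in FIFO order visits the graph level by level, so the recorded distances are shortest path lengths in edges. An index pointer stands in for a real queue to avoid O(n) shifts.

function bfsDistances(adj: Map<number, number[]>, source: number): Map<number, number> {
  const dist = new Map<number, number>([[source, 0]]);
  const queue: number[] = [source];
  let head = 0;                       // O(1) dequeue without shifting
  while (head < queue.length) {
    const u = queue[head++];
    for (const v of adj.get(u) ?? []) {
      if (!dist.has(v)) {             // first visit = shortest distance
        dist.set(v, dist.get(u)! + 1);
        queue.push(v);
      }
    }
  }
  return dist;
}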

Classic Recurrence Examples

// Merge sort: T(n) = 2T(n/2) + O(n) => O(n log n)
// Binary search: T(n) = T(n/2) + O(1) => O(log n)
// Tower of Hanoi: T(n) = 2T(n-1) + 1 => O(2^n)
// Master Theorem forms aid direct classification
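// Master Theorem shape: T(n) = a·T(n/b) + O(n^d)
//   d > log_b(a) → O(n^d);  d = log_b(a) → O(n^d · log n);  d < log_b(a) → O(n^(log_b(a)))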

4. Interactive Practice Lab

Experiment with parameters to build intuition.

  • Complexity Growth Estimator: enter n to compare growth rates
  • Sorting Step Visualizer: step through sorting operations
  • BFS / DFS Runner: inspect the traversal visit order

5. Complexity Cheat Sheet

Big-O (average unless noted).

Structure | Access | Search | Insert | Delete
Array | O(1) | O(n) | O(n) | O(n)
Dynamic Array (amort.) | O(1) | O(n) | O(1) | O(n)
Singly Linked List | O(n) | O(n) | O(1)* | O(1)*
Hash Table | — | O(1) | O(1) | O(1)
BST (balanced) | O(log n) | O(log n) | O(log n) | O(log n)
Heap | O(n) | O(n) | O(log n) | O(log n)
Trie | O(L) | O(L) | O(L) | O(L)
*Assumes a head reference (insert/delete at the head). Hash table worst case degrades to O(n) with a poor key distribution.

6. Review & Mastery Prep

Track your progress deliberately.

Concept Q&A

Why is quicksort often faster than merge sort despite same O(n log n)?

In-place (better cache locality), lower constant factors, no extra O(n) memory. Pivot quality influences performance.

Difference between recursion and backtracking?

Backtracking is structured recursion where you explore a decision space, undo (backtrack) partial choices to search alternatives systematically.

When to use a heap over sorting entire data?

When you only need the top k elements or a running min/max over a stream. Maintaining a size-k heap costs O(n log k) versus O(n log n) to sort everything.

Dynamic programming vs memoization?

DP usually refers to a bottom-up table filled in a fixed order; memoization is top-down caching. Both exploit overlapping subproblems; the choice depends on the recurrence's natural direction and whether the full table is needed.